-
Notifications
You must be signed in to change notification settings - Fork 65
New debug draw extension for AABBs #900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
public: | ||
struct SCachedCreationParameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one more indent please
#include "nbl/builtin/hlsl/shapes/aabb.hlsl" | ||
#include "nbl/ext/DebugDraw/builtin/hlsl/common.hlsl" | ||
|
||
namespace nbl::ext::debugdraw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either DebugDraw or debug_draw
#ifndef _DRAW_AABB_COMMON_HLSL | ||
#define _DRAW_AABB_COMMON_HLSL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
full header guard, with all namespaces prefixed
#ifdef __HLSL_VERSION | ||
float32_t3x4 transform; | ||
#else | ||
float transform[3*4]; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpp_compat
gives you float32_t3x4
in C++ too
#ifdef __HLSL_VERSION | ||
float32_t4x4 MVP; | ||
#else | ||
float MVP[4*4]; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again
struct PSInput | ||
{ | ||
float32_t4 position : SV_Position; | ||
float32_t4 color : TEXCOORD0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you using HW attributes for color? the color is per-instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah these are Vx-Px inter-stage shenanigans, I'll allow, make sure you label color flat though
@@ -0,0 +1,13 @@ | |||
#pragma shader_stage(fragment) | |||
|
|||
#include "common.hlsl" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nbl/ext/DebugDraw/builtin/hlsl/common.hlsl
is preferred
PSInput output; | ||
|
||
float32_t3 vertex = (bda::__ptr<float32_t3>::create(pc.pVertexBuffer) + glsl::gl_VertexIndex()).deref_restrict().load(); | ||
InstanceData instance = vk::RawBufferLoad<InstanceData>(pc.pInstanceBuffer + sizeof(InstanceData) * glsl::gl_InstanceIndex()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try to use vk::BufferPointer
instead of RawBufferLoad
float32_t4x4 transform; | ||
transform[0] = instance.transform[0]; | ||
transform[1] = instance.transform[1]; | ||
transform[2] = instance.transform[2]; | ||
transform[3] = float32_t4(0, 0, 0, 1); | ||
float32_t4 position = mul(transform, float32_t4(vertex, 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a util in math::linalg
to carry out a fast multiple like that, see how I use it in my simpleDebugRenderer's shaders in Examples-Tests sumbouldes' common
folder
transform[2] = instance.transform[2]; | ||
transform[3] = float32_t4(0, 0, 0, 1); | ||
float32_t4 position = mul(transform, float32_t4(vertex, 1)); | ||
output.position = mul(pc.MVP, position); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the instance data is emphemeral/transient (uploaded just before the draw) can you not pre-multiply the camera MVP with the instance transform on the CPU ?
Such that each instance stores combined MVP*transform
?
// records draw command for single AABB, user has to set pipeline outside | ||
bool renderSingle(video::IGPUCommandBuffer* commandBuffer); | ||
|
||
bool render(video::IGPUCommandBuffer* commandBuffer, video::ISemaphore::SWaitInfo waitInfo, float* cameraMat3x4); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
take the camera MVP as const hlsl::float32_t4x4&
|
||
bool render(video::IGPUCommandBuffer* commandBuffer, video::ISemaphore::SWaitInfo waitInfo, float* cameraMat3x4); | ||
|
||
static std::array<hlsl::float32_t3, 24> getVerticesFromAABB(const core::aabbox3d<float>& aabb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you need it for?
|
||
static std::array<hlsl::float32_t3, 24> getVerticesFromAABB(const core::aabbox3d<float>& aabb); | ||
|
||
void addAABB(const core::aabbox3d<float>& aabb, const hlsl::float32_t4& color = { 1,0,0,1 }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't provide legacy API with legacy types
void addAABB(const core::aabbox3d<float>& aabb, const hlsl::float32_t4& color = { 1,0,0,1 }); | ||
void addAABB(const hlsl::shapes::AABB<3,float>& aabb, const hlsl::float32_t4& color = { 1,0,0,1 }); | ||
|
||
void addOBB(const hlsl::shapes::AABB<3, float>& aabb, const hlsl::float32_t3x4 transform, const hlsl::float32_t4& color = { 1,0,0,1 }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the transform
shall be the OBB, adding an AABB into the mix is confusing
static bool createStreamingBuffer(SCreationParameters& params); | ||
|
||
std::vector<debugdraw::InstanceData> m_instances; | ||
std::array<hlsl::float32_t3, 24> m_unitAABBVertices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indexed line list takes 8 verts
static core::smart_refctd_ptr<video::IGPUGraphicsPipeline> createPipeline(SCreationParameters& params); | ||
static bool createStreamingBuffer(SCreationParameters& params); | ||
|
||
std::vector<debugdraw::InstanceData> m_instances; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
immediate mode for GUI is ok, but I'd like to avoid it here, get rid of add
and make render
take a span<const InstanceData>
std::chrono::steady_clock::time_point waitTill = std::chrono::steady_clock::now() + std::chrono::milliseconds(1u); | ||
streaming->multi_allocate(waitTill, 1, &inputOffset, &totalSize, &MaxAlignment); | ||
|
||
memcpy(streamingPtr + vertexByteOffset, m_unitAABBVertices.data(), sizeof(m_unitAABBVertices[0]) * m_unitAABBVertices.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make a small buffer at startup containing the indices (please draw indexed) so you're not copying it in all the time
P.S. the vertices should be a const float32_t3[8]
array in the vertex shader
std::array<float32_t3, 24> DrawAABB::getVerticesFromAABB(const core::aabbox3d<float>& aabb) | ||
{ | ||
const auto& pMin = aabb.MinEdge; | ||
const auto& pMax = aabb.MaxEdge; | ||
|
||
std::array<float32_t3, 24> vertices; | ||
vertices[0] = float32_t3(pMin.X, pMin.Y, pMin.Z); | ||
vertices[1] = float32_t3(pMax.X, pMin.Y, pMin.Z); | ||
vertices[2] = float32_t3(pMin.X, pMin.Y, pMin.Z); | ||
vertices[3] = float32_t3(pMin.X, pMin.Y, pMax.Z); | ||
|
||
vertices[4] = float32_t3(pMax.X, pMin.Y, pMax.Z); | ||
vertices[5] = float32_t3(pMax.X, pMin.Y, pMin.Z); | ||
vertices[6] = float32_t3(pMax.X, pMin.Y, pMax.Z); | ||
vertices[7] = float32_t3(pMin.X, pMin.Y, pMax.Z); | ||
|
||
vertices[8] = float32_t3(pMin.X, pMax.Y, pMin.Z); | ||
vertices[9] = float32_t3(pMax.X, pMax.Y, pMin.Z); | ||
vertices[10] = float32_t3(pMin.X, pMax.Y, pMin.Z); | ||
vertices[11] = float32_t3(pMin.X, pMax.Y, pMax.Z); | ||
|
||
vertices[12] = float32_t3(pMax.X, pMax.Y, pMax.Z); | ||
vertices[13] = float32_t3(pMax.X, pMax.Y, pMin.Z); | ||
vertices[14] = float32_t3(pMax.X, pMax.Y, pMax.Z); | ||
vertices[15] = float32_t3(pMin.X, pMax.Y, pMax.Z); | ||
|
||
vertices[16] = float32_t3(pMin.X, pMin.Y, pMin.Z); | ||
vertices[17] = float32_t3(pMin.X, pMax.Y, pMin.Z); | ||
vertices[18] = float32_t3(pMax.X, pMin.Y, pMin.Z); | ||
vertices[19] = float32_t3(pMax.X, pMax.Y, pMin.Z); | ||
|
||
vertices[20] = float32_t3(pMin.X, pMin.Y, pMax.Z); | ||
vertices[21] = float32_t3(pMin.X, pMax.Y, pMax.Z); | ||
vertices[22] = float32_t3(pMax.X, pMin.Y, pMax.Z); | ||
vertices[23] = float32_t3(pMax.X, pMax.Y, pMax.Z); | ||
|
||
return vertices; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function not needed, you shall store this array in vertex shader and index it with gl_VertexIndex
void DrawAABB::addAABB(const core::aabbox3d<float>& aabb, const hlsl::float32_t4& color) | ||
{ | ||
addAABB(shapes::AABB<3, float>{{aabb.MinEdge.X, aabb.MinEdge.Y, aabb.MinEdge.Z}, { aabb.MaxEdge.X, aabb.MaxEdge.Y, aabb.MaxEdge.Z }}, color); | ||
} | ||
|
||
void DrawAABB::addAABB(const hlsl::shapes::AABB<3,float>& aabb, const hlsl::float32_t4& color) | ||
{ | ||
const auto transform = hlsl::float32_t3x4(1); | ||
addOBB(aabb, transform, color); | ||
} | ||
|
||
void DrawAABB::addOBB(const hlsl::shapes::AABB<3, float>& aabb, const hlsl::float32_t3x4 transform, const hlsl::float32_t4& color) | ||
{ | ||
InstanceData instance; | ||
instance.color = color; | ||
|
||
core::matrix3x4SIMD instanceTransform; | ||
instanceTransform.setTranslation(core::vectorSIMDf(aabb.minVx.x, aabb.minVx.y, aabb.minVx.z, 0)); | ||
const auto diagonal = aabb.getExtent(); | ||
instanceTransform.setScale(core::vectorSIMDf(diagonal.x, diagonal.y, diagonal.z)); | ||
|
||
core::matrix3x4SIMD worldTransform; | ||
memcpy(worldTransform.pointer(), &transform, sizeof(transform)); | ||
|
||
instanceTransform = core::concatenateBFollowedByA(worldTransform, instanceTransform); | ||
memcpy(instance.transform, instanceTransform.pointer(), sizeof(core::matrix3x4SIMD)); | ||
|
||
m_instances.push_back(instance); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather have different constructors or factories of InstanceData
from transform, aabb, color and camera mvp than this immediate mode non-threadsafe (one hidden implicit vector) thing
bool DrawAABB::renderSingle(IGPUCommandBuffer* commandBuffer) | ||
{ | ||
commandBuffer->setLineWidth(1.f); | ||
commandBuffer->draw(24, 1, 0, 0); | ||
|
||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function kinda makes no sense, you'd need to have a separate pipeline which takes the instance data as a push constant
here to draw a single AABB I have to jump through more hoops than calling DrawAABB::render
using offset_t = SCachedCreationParameters::streaming_buffer_t::size_type; | ||
constexpr auto MdiSizes = std::to_array<offset_t>({ sizeof(float32_t3), sizeof(InstanceData) }); | ||
// shared nPoT alignment needs to be divisible by all smaller ones to satisfy an allocation from all | ||
constexpr offset_t MaxAlignment = std::reduce(MdiSizes.begin(), MdiSizes.end(), 1, [](const offset_t a, const offset_t b)->offset_t {return std::lcm(a, b); }); | ||
// allocator initialization needs us to round up to PoT | ||
const auto MaxPOTAlignment = roundUpToPoT(MaxAlignment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shall only allocate the instance data in the MDI buffer, it will make all the logic here a lot simpler!
Index buffer must be premade and constant at create
time
No description provided.